differentiable group normalization
Towards Deeper Graph Neural Networks with Differentiable Group Normalization
Graph neural networks (GNNs), which learn the representation of a node by aggregating its neighbors, have become an effective computational tool in downstream applications. Over-smoothing is one of the key issues that limit the performance of GNNs as the number of layers increases: the stacked aggregators make node representations converge to indistinguishable vectors. Several attempts have been made to tackle the issue by bringing linked node pairs close and pushing unlinked pairs apart. However, they often ignore the intrinsic community structures and can result in sub-optimal performance. The representations of nodes within the same community/class need to be similar to facilitate classification, while different classes are expected to be separated in the embedding space. To bridge the gap, we introduce two over-smoothing metrics and a novel technique, i.e., differentiable group normalization (DGN). It normalizes nodes within the same group independently to increase their smoothness, and separates node distributions among different groups to significantly alleviate the over-smoothing issue. Experiments on real-world datasets demonstrate that DGN makes GNN models more robust to over-smoothing and achieves better performance with deeper GNNs.
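The group-wise normalization the abstract describes can be sketched roughly as follows. This is a minimal NumPy sketch based only on the abstract's description (soft grouping of nodes, independent normalization within each group, and a combination that keeps groups separated); the assignment weights `U`, the residual combination with weight `lam`, and the per-group statistics are assumptions about the method's details, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dgn_layer(H, U, lam=0.01, eps=1e-5):
    """Sketch of differentiable group normalization.

    H: (n, d) node embeddings from a GNN layer.
    U: (d, G) assignment weights producing soft group memberships
       (in the paper these would be learned end-to-end; here they
       are passed in explicitly -- an assumption of this sketch).
    """
    S = softmax(H @ U, axis=1)  # (n, G) soft assignment of nodes to groups
    out = H.copy()              # keep original embeddings as a residual term
    for g in range(S.shape[1]):
        Hg = S[:, [g]] * H      # embeddings weighted by membership in group g
        mu = Hg.mean(axis=0, keepdims=True)
        var = Hg.var(axis=0, keepdims=True)
        # normalize group g with its own statistics, add back with weight lam
        out = out + lam * (Hg - mu) / np.sqrt(var + eps)
    return out
```

Normalizing each group with its own mean and variance is what distinguishes this from batch normalization over all nodes: nodes in the same group are smoothed toward a shared distribution, while different groups keep distinct statistics.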
Review for NeurIPS paper: Towards Deeper Graph Neural Networks with Differentiable Group Normalization
Weaknesses: (1) The empirical results seem weak compared to other works [1] that also tackle the over-smoothing problem. According to Table 1, deep GNNs with DGN outperform those with other normalization mechanisms; however, performance degradation still occurs as the GNNs are made deeper. (2) The idea is somewhat incremental, although the proposed differentiable group normalization is indeed relevant to the problem. (3) The Instance Information Gain metric employs the mutual information between the input features and the output representations, which seems questionable: according to Appendix F, the output representation is taken from the final prediction layer, i.e., the result of a linear transformation applied to the top hidden features.
Review for NeurIPS paper: Towards Deeper Graph Neural Networks with Differentiable Group Normalization
This paper presents a method to address the over-smoothing issue in deep graph neural networks using differentiable group normalization, together with metrics to measure over-smoothing. All reviewers agree that this paper tackles an important problem and that the empirical results verify the main claim of the paper. The reviewers raised some issues regarding comparisons with previous work. However, the authors noted that these papers are relatively recent (available publicly in April 2020 or later). I consider these papers to be parallel work and therefore understand why the authors did not compare with them in the current version.
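As background for the over-smoothing metrics discussed in the reviews, one of the two metrics the abstract mentions, a group-level separation measure, could be sketched as follows. This is a hedged sketch based on the abstract's framing (groups should be internally smooth but mutually separated); the exact definition, the use of Euclidean distance, and the name `group_distance_ratio` are assumptions, not the paper's formula.

```python
import numpy as np

def group_distance_ratio(H, labels, eps=1e-12):
    """Ratio of mean inter-group to mean intra-group pairwise distance.

    H: (n, d) node embeddings; labels: length-n group/class assignments.
    A ratio near 1 means nodes from different groups are about as close
    as nodes within a group, i.e., the embedding space is over-smoothed.
    """
    labels = np.asarray(labels)
    # all pairwise Euclidean distances, (n, n)
    D = np.linalg.norm(H[:, None, :] - H[None, :, :], axis=-1)
    same = labels[:, None] == labels[None, :]
    iu = np.triu_indices(len(labels), k=1)  # each unordered pair once
    intra = D[iu][same[iu]]                 # pairs within a group
    inter = D[iu][~same[iu]]                # pairs across groups
    return inter.mean() / (intra.mean() + eps)
```

Tracking this ratio layer by layer would show over-smoothing directly: as representations collapse toward indistinguishable vectors, inter-group and intra-group distances converge and the ratio falls toward 1.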
Zhou, Kaixiong, Huang, Xiao, Li, Yuening, Zha, Daochen, Chen, Rui, Hu, Xia